Risk Identification in Driving Scenarios via Fusion of Scene Graph Embeddings and Optical Flow Features
XIAO Yao1, YANG Yijian1, GOU Chao1
1. Guangdong Provincial Key Laboratory of Intelligent Transportation Systems, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 518107
Abstract: The spatiotemporal and behavioral interactions of multimodal traffic participants are complex and difficult to recognize accurately, which makes driving risk identification harder. To address this issue, a virtual traffic scene graph dataset, CARLA_242, is constructed for collision risk assessment. The dataset contains seven types of traffic participants and sixteen types of scene graph relations. On this basis, a risk identification method that fuses scene graph embeddings with optical flow features is proposed. The method consists of three core modules. In the spatial modeling module, node features and relation information are jointly encoded by a multi-relational graph convolutional network, and scene graph embeddings are then obtained through graph pooling and readout operations. In the optical flow extraction module, optical flow is estimated from video sequences, and optical flow features representing dynamic motion are extracted. In the spatiotemporal modeling module, the scene graph embeddings and optical flow features are fused, and the fused representations are passed to a temporal transformer encoder, whose output is used to identify driving risk. Experiments on three scene graph datasets demonstrate the superior performance of the proposed method. The results validate the effectiveness of multimodal fusion of scene graph and optical flow features for driving risk identification.
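The three-module data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, layer sizes, random untrained weights, and toy inputs are all assumptions made for illustration. Graph pooling is omitted, a single self-attention step stands in for the temporal transformer encoder, and the optical flow features are random placeholders rather than the output of a flow estimator. Only the counts of seven participant types and sixteen relation types come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Counts taken from the abstract; every other size is an illustrative assumption.
NUM_TYPES, NUM_RELS = 7, 16
T, N, H, FLOW_DIM, ATT_DIM = 4, 5, 8, 6, 16


def rgcn_layer(x, adj, w_rel, w_self):
    """One multi-relational graph convolution layer (R-GCN style).

    x: (N, F) node features; adj: (R, N, N), adj[r, i, j] = 1 if node j
    sends a relation-r edge to node i; w_rel: (R, F, H); w_self: (F, H).
    """
    out = x @ w_self  # self-connection term
    for r in range(adj.shape[0]):
        deg = adj[r].sum(axis=1, keepdims=True)
        # Row-normalise so each node averages over its relation-r neighbours.
        norm = np.divide(adj[r], deg, out=np.zeros_like(adj[r]), where=deg > 0)
        out += norm @ x @ w_rel[r]
    return np.maximum(out, 0.0)  # ReLU


def readout(h):
    """Graph-level embedding: concatenated mean and max over nodes."""
    return np.concatenate([h.mean(axis=0), h.max(axis=0)])


def self_attention(seq, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over the time axis."""
    q, k, v = seq @ w_q, seq @ w_k, seq @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ v


def risk_score(node_feats, adjs, flow_feats, p):
    """Fuse per-frame graph embeddings with flow features, then score the clip."""
    fused = []
    for x, adj, flow in zip(node_feats, adjs, flow_feats):
        h = rgcn_layer(x, adj, p["w_rel"], p["w_self"])
        fused.append(np.concatenate([readout(h), flow]))  # fusion by concatenation
    seq = np.stack(fused)  # (T, 2H + FLOW_DIM)
    ctx = self_attention(seq, p["w_q"], p["w_k"], p["w_v"])
    logit = ctx.mean(axis=0) @ p["w_out"]
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid


# Hypothetical toy clip: one-hot participant types, sparse random relations.
node_feats = [np.eye(NUM_TYPES)[rng.integers(0, NUM_TYPES, size=N)] for _ in range(T)]
adjs = [(rng.random((NUM_RELS, N, N)) < 0.1).astype(float) for _ in range(T)]
flow_feats = [rng.normal(size=FLOW_DIM) for _ in range(T)]  # placeholder features

D = 2 * H + FLOW_DIM
params = {
    "w_rel": 0.1 * rng.normal(size=(NUM_RELS, NUM_TYPES, H)),
    "w_self": 0.1 * rng.normal(size=(NUM_TYPES, H)),
    "w_q": 0.1 * rng.normal(size=(D, ATT_DIM)),
    "w_k": 0.1 * rng.normal(size=(D, ATT_DIM)),
    "w_v": 0.1 * rng.normal(size=(D, ATT_DIM)),
    "w_out": 0.1 * rng.normal(size=ATT_DIM),
}

score = risk_score(node_feats, adjs, flow_feats, params)
print(f"untrained risk score: {score:.3f}")
```

Because the weights are random, the printed score carries no meaning; the sketch only shows how per-frame graph embeddings and flow features could be combined into one sequence before temporal modeling. Concatenation is one simple fusion choice, and the abstract does not state which fusion operator the method actually uses.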